N=1:

rank | frequency | n-gram |
---|---|---|
1 | 3282 | -n |
2 | 1594 | -e |
3 | 1512 | -t |
4 | 1263 | -r |
5 | 643 | -s |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 2505 | -en |
2 | 1032 | -er |
3 | 361 | -te |
4 | 279 | -ng |
5 | 240 | -ch |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 464 | -ten |
2 | 297 | -hen |
3 | 277 | -gen |
4 | 195 | -ter |
5 | 187 | -ung |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 224 | -chen |
2 | 120 | -ngen |
3 | 92 | -sten |
4 | 76 | -eben |
5 | 75 | -cher |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 77 | -schen |
2 | 70 | -ungen |
3 | 63 | -chaft |
4 | 60 | -ische |
5 | 46 | -ommen |

The tables show the most frequent letter n-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
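
The same count can be sketched outside the database for any N. The Python function below is a minimal illustration, not the query used to produce the tables: the name `top_endings` and the toy word list are hypothetical. It does not reproduce the `w_id>100` filter, so it assumes the input list has already been restricted to the intended words. Comparison here is case-sensitive, which may differ from the database's collation.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams.

    Each row is (rank, frequency, "-" + ending), matching the table columns.
    Like RIGHT(word, n) in the query, word[-n:] yields the whole word
    when the word is shorter than n characters.
    """
    counts = Counter(word[-n:] for word in words if word)
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(counts.most_common(k), start=1)
    ]


# Hypothetical toy word list, for illustration only.
sample = ["machen", "lachen", "sagen", "Zeitung", "besten"]
for row in top_endings(sample, 3):
    print(row)
```

Calling `top_endings(words, n)` for n from 1 to 5 would give one ranked list per table; endings with equal frequency are returned in the order in which they first appear in the input.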